
39 research outputs found

High performance Java for multi-core systems

Author: Ramos Garea Sabela
Publication date: 01/01/2013

[Abstract] Interest in Java within the High Performance Computing (HPC) community has been rising in recent years thanks to its noticeable performance improvements and its productivity features. In a context where the trend towards more cores per processor is leading to the generalization of many-core processors and accelerators, multithreading as an inherent feature of the language makes Java extremely interesting for exploiting the performance provided by multi- and many-core architectures. This PhD Thesis presents a thorough analysis of the current state of the art regarding multi- and many-core programming in Java and provides the design, implementation and evaluation of several solutions to enable Java for the many-core era. To achieve this, a shared memory message-passing solution has been implemented to provide shared memory programming with the scalability of distributed memory paradigms, together with the benefits of a portable programming model that allows the developed codes to be run on distributed memory systems. Moreover, representative collective operations, involving computation and communication among different processes or threads, have been optimized, also introducing in Java new scalability features from the MPI 3.0 specification, namely nonblocking collectives. Regarding the exploitation of many-core architectures, the lack of direct Java support forces developers to resort to wrappers or higher-level solutions to translate Java code into CUDA or OpenCL. The most relevant among these solutions have been evaluated and thoroughly analyzed in terms of performance and productivity. Guidelines for taking advantage of shared memory environments have been derived during the analysis and development of the proposed solutions, and the main conclusion is that the use of Java for shared memory programming on multi- and many-core systems is not only productive but can also provide competitive high-performance results. However, in order to effectively take advantage of the underlying multi- and many-core architectures, the key is the availability of optimized middleware that abstracts multithreading details from the user, like the one proposed in this Thesis, and the optimization of common operations like collective communications.

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Estudio morfodinámico de una playa lineal. Aplicación al caso de Gandía

Author: Martínez Ramos Sabela
Publication venue: 'Universitat Politecnica de Valencia'
Publication date: 20/01/2014

The purpose of this project is to characterize the spatiotemporal evolution of a linear beach using the simulation software Coastal Modelling System (Sistema de Modelado Costero, SMC), developed by the University of Cantabria (UC) and the Institute of Environmental Hydraulics of Cantabria (IH Cantabria). The Norte beach of Gandía is taken as an example of this type of beach. The goal is to predict, within the constraints imposed by the known parameters, the evolution of the coastline, the influence of the waves, the coastal currents, and the types of sedimentary structures that form as a result. This work aims to characterize the type of beach being analyzed (reflective or dissipative), to determine the sedimentary structures generated on the beach, and to establish how they relate to the prevailing conditions of the maritime climate. To do this, we study the reaction of the beach to different meteorological events, from calm to exceptional storm conditions, and compare them to gain insight into how marine dynamics influence coastal morphology.
Martínez Ramos, S. (2013). Estudio morfodinámico de una playa lineal. Aplicación al caso de Gandía. Universitat Politècnica de València. http://hdl.handle.net/10251/34979

RiuNet

Design of efficient Java message-passing collectives on multi-core clusters

Author: Doallo Ramón
López Taboada Guillermo
Ramos Garea Sabela
Touriño Juan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/08/2010

This is a post-peer-review, pre-copyedit version of an article published in The Journal of Supercomputing. The final authenticated version is available online at: https://doi.org/10.1007/s11227-010-0464-5
[Abstract] This paper presents a scalable and efficient Message-Passing in Java (MPJ) collective communication library for parallel computing on multi-core architectures. The continuous increase in the number of cores per processor underscores the need for scalable parallel solutions. Moreover, current system deployments are usually multi-core clusters, a hybrid shared/distributed memory architecture which increases the complexity of communication protocols. Here, Java represents an attractive choice for the development of communication middleware for these systems, as it provides built-in networking and multithreading support. As the performance gap between Java and compiled languages has been narrowing in recent years, Java is an emerging option for High Performance Computing (HPC). Our MPJ collective communication library increases the performance of Java HPC applications on multi-core clusters by: (1) providing multi-core aware collective primitives; (2) implementing several algorithms (up to six) per collective operation, whereas publicly available MPJ libraries are usually restricted to one algorithm; (3) analyzing the efficiency of thread-based collective operations; (4) selecting at runtime the most efficient algorithm depending on the specific multi-core system architecture, and the number of cores and message length involved in the collective operation; (5) supporting the automatic performance tuning of the collectives depending on the system and communication parameters; and (6) allowing its integration in any MPJ implementation, as it is based on MPJ point-to-point primitives. A performance evaluation on an InfiniBand and Gigabit Ethernet multi-core cluster has shown that the implemented collectives significantly outperform the original ones and yield higher speedups in collective-communication-intensive Java HPC applications. Finally, the presented library has been successfully integrated in MPJ Express (http://mpj-express.org), and will be distributed with the next release.
Ministerio de Ciencia e Innovación; TIN2010-16735. Ministerio de Educación; FPU; AP2009-2112. Xunta de Galicia; PGIDIT06PXIB105228P
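The abstract above mentions several algorithms per collective operation, with one selected at runtime. As a rough illustration only, and not the library's actual code, the following Python sketch contrasts two textbook broadcast schedules (a flat tree and a binomial tree) and chooses between them. The function names and the `small_group` threshold are hypothetical; the selection described in the abstract also depends on message length and system architecture, which this sketch ignores.

```python
def flat_broadcast(num_procs, root=0):
    """Flat tree: the root sends to every other rank, one message per round."""
    return [[(root, dst)] for dst in range(num_procs) if dst != root]


def binomial_broadcast(num_procs, root=0):
    """Binomial tree: in each round, every rank that holds the data forwards it once."""
    rounds = []
    distance = 1
    while distance < num_procs:
        sends = []
        # Relative ranks 0..distance-1 already hold the data at this point.
        for rel in range(distance):
            peer = rel + distance
            if peer < num_procs:
                src = (rel + root) % num_procs
                dst = (peer + root) % num_procs
                sends.append((src, dst))
        rounds.append(sends)
        distance *= 2
    return rounds


def select_broadcast(num_procs, small_group=4):
    """Hypothetical selection rule: flat tree for tiny groups, binomial otherwise."""
    if num_procs <= small_group:
        return flat_broadcast(num_procs)
    return binomial_broadcast(num_procs)


if __name__ == "__main__":
    for round_number, sends in enumerate(select_broadcast(8), start=1):
        print(round_number, sends)
```

The binomial schedule finishes in a number of rounds that grows logarithmically with the number of ranks, whereas the flat schedule needs one round per receiver, which is one reason a library might keep both and choose at runtime.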

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Parallel Pairwise Epistasis Detection on Heterogeneous Computing Architectures

Author: González-Domínguez Jorge
Ramos Garea Sabela
Schmidt Bertil
Touriño Juan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

This is a post-peer-review, pre-copyedit version of an article published in IEEE Transactions on Parallel and Distributed Systems. The final authenticated version is available online at: http://dx.doi.org/10.1109/TPDS.2015.2460247.
[Abstract] Development of new methods to detect pairwise epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important task in bioinformatics, as they can help to explain genetic influences on diseases. As these studies are time-consuming operations, some tools exploit the characteristics of different hardware accelerators (such as GPUs and Xeon Phi coprocessors) to reduce the runtime. Nevertheless, none of these approaches can efficiently exploit the whole computational capacity of modern clusters that contain both GPUs and Xeon Phi coprocessors. In this paper we investigate approaches to map pairwise epistasis detection onto heterogeneous clusters using both types of accelerators. The runtimes to analyze the well-known WTCCC dataset, consisting of about 500 K SNPs and 5 K samples, on one and two NVIDIA K20m GPUs are reduced by 27 percent thanks to a hybrid approach that adds a single Xeon Phi coprocessor.
Wellcome Trust; 076113. Wellcome Trust; 085475. Ministerio de Economía y Competitividad; TIN2013-42148-
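To give a sense of the workload size, the sketch below counts the SNP pairs an exhaustive pairwise search has to test and splits them across devices in proportion to an assumed relative speed. This is only an illustration under stated assumptions: the abstract does not describe the paper's distribution scheme, and the device names, the integer speed weights, and the static proportional split are all hypothetical.

```python
def count_pairs(num_snps):
    """Number of unordered SNP pairs: n * (n - 1) / 2."""
    return num_snps * (num_snps - 1) // 2


def split_pairs(total_pairs, relative_speeds):
    """Split pairs proportionally to integer speed weights.

    The last device receives whatever remains after integer rounding,
    so the shares always add up to total_pairs.
    """
    total_speed = sum(relative_speeds.values())
    names = list(relative_speeds)
    shares = {}
    assigned = 0
    for name in names[:-1]:
        share = total_pairs * relative_speeds[name] // total_speed
        shares[name] = share
        assigned += share
    shares[names[-1]] = total_pairs - assigned
    return shares


if __name__ == "__main__":
    pairs = count_pairs(500_000)  # roughly the dataset size quoted in the abstract
    # Hypothetical weights: two GPUs and one coprocessor.
    print(split_pairs(pairs, {"gpu0": 3, "gpu1": 3, "phi0": 2}))
```

With about 500 K SNPs the pair count is on the order of 10^11, which is why spreading the work over every available accelerator matters.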

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Tinjauan Kelengkapan Pengisian Formulir Persetujuan Tindakan Kedokteran Pasien Bedah Rawat Inap Di Rumah Sakit Tere Margareth Tahun 2022

Author: Nababan Edward Ramos
Ritonga Zulham Andi
Sabela Hasibuan Ali
Simanjuntak Marta
Publication venue: Akademi Perekam dan Informasi Kesehatan Imelda
Publication date: 31/08/2023

Completeness of the informed consent form is very important because it can affect the quality of medical records and the legal aspects contained in the medical records themselves. This research is descriptive with an observational approach, that is, research that describes the current situation. The study population was the medical record documents containing the informed consent forms of inpatients, and the sample was a portion of that population. To assess the completeness of informed consent sheets in surgical cases at Tere Margareth General Hospital, the sample size was calculated as the total population (235) divided by 1 + 235 × (precision level of 10% = 0.1)², which results in a sample of 70. Completeness of filling in the identification of providing information is 97% filled and 3% not filled. Completeness of filling in important report items is 92% filled and 8% not filled. Completeness of filling in medical action items is 98% filled and 2% not filled. Completeness of filling in authentication items is 87% filled and 3% not filled. The medical record unit is trying to ask the nurse in charge to fill out the informed consent form so that it is completed in full.
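The sample-size arithmetic in this abstract matches the formula often called Slovin's formula, n = N / (1 + N·e²). The abstract does not name the formula, so that reading is an assumption. A minimal Python check of the arithmetic, with a hypothetical function name:

```python
def slovin_sample_size(population, margin_of_error):
    """Sample size n = N / (1 + N * e^2), assuming Slovin's formula is what was used."""
    return population / (1 + population * margin_of_error ** 2)


if __name__ == "__main__":
    n = slovin_sample_size(235, 0.1)  # N = 235, e = 10%
    print(round(n))  # 70
```

With N = 235 and e = 0.1 the denominator is 3.35, so n is about 70.1, consistent with the reported sample of 70.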

Direktori Jurnal Elektronik Universitas Imelda Medan

FastMPJ: a scalable and efficient Java message-passing library

Author: Doallo Ramón
Expósito Roberto R.
López Taboada Guillermo
Ramos Garea Sabela
Touriño Juan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

This is a post-peer-review, pre-copyedit version of an article published in Cluster Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10586-014-0345-4
[Abstract] The performance and scalability of communications are key for high performance computing (HPC) applications in the current multi-core era. Despite the significant benefits (e.g., productivity, portability, multithreading) of Java for parallel programming, its poor communications support has hindered its adoption in the HPC community. This paper presents FastMPJ, an efficient message-passing in Java (MPJ) library, boosting Java for HPC by: (1) providing high-performance shared memory communications using Java threads; (2) taking full advantage of high-speed cluster networks (e.g., InfiniBand) to provide low-latency and high-bandwidth communications; (3) including a scalable collective library with topology-aware primitives, automatically selected at runtime; (4) avoiding Java data buffering overheads through zero-copy protocols; and (5) implementing the most widespread MPI-like Java bindings for highly productive development. The comprehensive performance evaluation on representative testbeds (InfiniBand, 10 Gigabit Ethernet, Myrinet, and shared memory systems) has shown that FastMPJ communication primitives rival native MPI implementations, significantly improving the efficiency and scalability of Java HPC parallel applications.
Ministerio de Educación y Ciencia; AP2010-4348. Ministerio de Economía y Competitividad; TIN2010-16735. Xunta de Galicia; CN2012/211. Xunta de Galicia; GRC2013/05

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models

Author: Agarwal Rishabh
Bachem Olivier
Geist Matthieu
Ramos Sabela
Stanczyk Piotr
Vieillard Nino
Publication date: 23/06/2023

Knowledge distillation is commonly used for compressing neural networks to reduce their inference cost and memory footprint. However, current distillation methods for auto-regressive models, such as generative language models (LMs), suffer from two key issues: (1) distribution mismatch between output sequences during training and the sequences generated by the student during its deployment, and (2) model under-specification, where the student model may not be expressive enough to fit the teacher's distribution. To address these issues, we propose Generalized Knowledge Distillation (GKD). GKD mitigates distribution mismatch by sampling output sequences from the student during training. Furthermore, GKD handles model under-specification by optimizing alternative divergences, such as reverse KL, that focus on generating samples from the student that are likely under the teacher's distribution. We demonstrate that GKD outperforms commonly used approaches for distilling LLMs on summarization, machine translation, and arithmetic reasoning tasks.
Comment: First two authors contributed equally
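The distinction between forward and reverse KL mentioned in the abstract can be seen on a toy next-token distribution. The sketch below is not the GKD training procedure (it does not sample sequences from the student); it only computes the two divergences for made-up teacher and student probabilities over a three-token vocabulary, which are assumptions for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token probabilities over a three-token vocabulary.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

# Forward KL weights the log-ratio by the teacher's probabilities.
forward_kl = kl_divergence(teacher, student)
# Reverse KL weights it by the student's probabilities, so it penalizes
# student mass placed on tokens the teacher considers unlikely.
reverse_kl = kl_divergence(student, teacher)

if __name__ == "__main__":
    print(forward_kl, reverse_kl)
```

The two values differ because KL divergence is not symmetric, which is why the choice of divergence changes what the student is encouraged to match.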

arXiv.org e-Print Archive

Performance analysis of HPC applications in the cloud

Author: Doallo Ramón
Expósito Roberto R.
López Taboada Guillermo
Ramos Garea Sabela
Touriño Juan
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

[Abstract] The scalability of High Performance Computing (HPC) applications depends heavily on the efficient support of network communications in virtualized environments. However, Infrastructure as a Service (IaaS) providers are more focused on deploying systems with higher computational power interconnected via high-speed networks than on improving the scalability of the communication middleware. This paper analyzes the main performance bottlenecks in HPC application scalability on the Amazon EC2 Cluster Compute platform: (1) evaluating the communication performance on shared memory and a virtualized 10 Gigabit Ethernet network; (2) assessing the scalability of representative HPC codes, the NAS Parallel Benchmarks, using a large number of cores, up to 512; (3) analyzing the new cluster instances (CC2) in terms of single-instance performance, scalability and cost-efficiency; (4) suggesting techniques for reducing the impact of the virtualization overhead on the scalability of communication-intensive HPC codes, such as giving the Virtual Machine direct access to the network and reducing the number of processes per instance; and (5) proposing the combination of message-passing with multithreading as the most scalable and cost-effective option for running HPC applications on the Amazon EC2 Cluster Compute platform.
Ministerio de Ciencia e Innovación; TIN2010-16735. Ministerio de Economía y Competitividad; AP2010-4348

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Design of Scalable Java Communication Middleware for Multi-Core Systems

Author: Doallo Ramón
Expósito Roberto R.
López Taboada Guillermo
Ramos Garea Sabela
Touriño Juan
Publication venue: 'Oxford University Press (OUP)'
Publication date: 05/09/2012

This is a post-peer-review, pre-copyedit version of an article published in The Computer Journal. The final authenticated version is available online at: https://doi.org/10.1093/comjnl/bxs122
[Abstract] This paper presents smdev, a shared memory communication middleware for multi-core systems. smdev provides a simple and powerful messaging application program interface that exploits the underlying multi-core architecture by replacing inter-process and network-based communications with threads and shared memory transfers. The performance evaluation of smdev on several multi-core systems has shown noticeable improvements compared with other Java shared memory solutions, matching and even surpassing the performance of natively compiled libraries. In particular, smdev has obtained start-up latencies around 0.76 μs and almost 90 Gbps bandwidth for point-to-point communications, as well as high performance and scalability both for collective operations and representative messaging kernels. These results have motivated the integration of smdev in F-MPJ, our message-passing implementation in Java.
Ministerio de Ciencia e Innovación; TIN2010-1673
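The idea of replacing inter-process communication with threads and shared memory transfers can be sketched in a few lines. This is a conceptual illustration in Python, not smdev's Java API: the `run_ranks` helper, the mailbox layout, and the ping-pong exchange are all hypothetical, and Python threads will not reproduce the latencies reported in the abstract.

```python
import queue
import threading


def run_ranks(num_ranks, body):
    """Run body(rank, mailboxes) in one thread per rank.

    Each rank owns an in-memory mailbox; a send is just a put() of an
    object reference into the receiver's mailbox, with no copy through
    a socket or pipe.
    """
    mailboxes = [queue.Queue() for _ in range(num_ranks)]
    threads = [
        threading.Thread(target=body, args=(rank, mailboxes))
        for rank in range(num_ranks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


results = {}


def ping_pong(rank, mailboxes):
    """Rank 0 sends a message to rank 1 and waits for the reply."""
    if rank == 0:
        mailboxes[1].put("ping")            # send to rank 1
        results[0] = mailboxes[0].get()     # blocking receive
    else:
        message = mailboxes[1].get()        # blocking receive
        results[1] = message
        mailboxes[0].put(message + "-pong") # reply to rank 0


run_ranks(2, ping_pong)
```

Because both ranks live in the same address space, the message is handed over by reference, which is the property a shared memory device exploits to avoid network-style buffering.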

Repositorio da Universidade da Coruña

Crossref

Evaluation of messaging middleware for high-performance cloud computing

Author: Doallo Ramón
Expósito Roberto R.
López Taboada Guillermo
Ramos Garea Sabela
Touriño Juan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/09/2012

This is a post-peer-review, pre-copyedit version of an article published in Personal and Ubiquitous Computing. The final authenticated version is available online at: http://dx.doi.org/10.1007/s00779-012-0605-3
[Abstract] Cloud computing poses several challenges, such as security, fault tolerance, access interface singularity, and network constraints, both in terms of latency and bandwidth. In this scenario, the performance of communications depends both on the network fabric and on its efficient support in virtualized environments, which ultimately determines the overall system performance. To address the current network constraints in cloud services, providers are deploying high-speed networks, such as 10 Gigabit Ethernet. This paper presents an evaluation of high-performance computing message-passing middleware on a cloud computing infrastructure, Amazon EC2 cluster compute instances, equipped with 10 Gigabit Ethernet. The analysis of the experimental results, compared against a similar testbed, has shown the significant impact that virtualized environments still have on communication performance, which demands more efficient communication middleware support to overcome the current cloud network limitations.
Ministerio de Ciencia e Innovación; TIN2010-16735. Ministerio de Educación y Ciencia; AP2010-434

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref